[pull] master from ruby:master#996
Merged
This is a debug mode in Ruby where an extra word is used after each object to store the address of the Ractor that owns the object, used for debug purposes only. While we're working on Ractors, we also need to be able to test with MMTk enabled, so we should introduce support for this to the MMTk binding as well. As implemented we'll default the binding options to have everything disabled and hardcoded to 0, as was always the case, but if RACTOR_CHECK_MODE is enabled, we'll build and pass a valid RubyBinding object to MMTk. ruby/mmtk@83cb291313
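The "extra word after each object" layout can be pictured with a small C sketch. This is only an illustration of the idea, not the code in Ruby or in the MMTk binding: `DEMO_CHECK_MODE`, `demo_suffix_size`, `demo_alloc`, and `demo_owner` are hypothetical names, and a plain `calloc` stands in for the real allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a RACTOR_CHECK_MODE-style build flag. */
#ifndef DEMO_CHECK_MODE
#define DEMO_CHECK_MODE 1
#endif

/* Bytes reserved after each object: one word in check mode, else 0. */
static size_t
demo_suffix_size(void)
{
    return DEMO_CHECK_MODE ? sizeof(uintptr_t) : 0;
}

/* Allocate an object and, in check mode, record its owner in the
 * trailing word. obj_size must be word-aligned so the suffix is too. */
static void *
demo_alloc(size_t obj_size, uintptr_t owner)
{
    assert(obj_size % sizeof(uintptr_t) == 0);
    char *mem = calloc(1, obj_size + demo_suffix_size());
    if (mem != NULL && demo_suffix_size() != 0) {
        *(uintptr_t *)(mem + obj_size) = owner;
    }
    return mem;
}

/* Read back the owner word stored after the object (check mode only). */
static uintptr_t
demo_owner(const void *obj, size_t obj_size)
{
    assert(demo_suffix_size() != 0);
    return *(const uintptr_t *)((const char *)obj + obj_size);
}
```

With the flag off, the suffix size is 0 and the layout is unchanged, which mirrors the "everything disabled and hardcoded to 0" default described above.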
The only caller of this now unconditionally constructs a binding options object, so this is actually dead code. ruby/mmtk@d832004e89
* numeric: emit two decimal digits per iteration in rb_fix2str
Replace the digit-at-a-time loop in rb_fix2str with the standard
itoa 2-digit lookup table for base 10. Each iteration now
writes two digits using a single (u % 100, u / 100) pair, so the
number of loop iterations is halved for multi-digit integers.
The classic per-digit loop is kept for non-base-10 conversion.
Benchmark (Apple M-series, 5M-10M ops, best of 3 runs):
case                    base        patch       delta
---------------------   ---------   ---------   -----------
1-digit (5)             64 ns/op    64 ns/op    -0%
2-digit (42)            64 ns/op    65 ns/op    +2% (noise)
3-digit (400)           66 ns/op    64 ns/op    -3%
5-digit (12345)         69 ns/op    67 ns/op    -3%
10-digit (1234567890)   77 ns/op    67 ns/op    -13%
19-digit (2^62-1)       111 ns/op   75 ns/op    -33%
The crossover is at ~3 digits: below that the constant setup
dominates and the benefit is within noise, above that the halved
iteration count shows up linearly. Typical Rails payloads mix
short IDs (1-5 digits) and longer values (timestamps, nanos,
large counts), so the win is workload-dependent but strictly
non-negative for real code.
Correctness: 100k random fuzz across the full fixnum range plus
targeted edges (0, ±1, ±99, ±100, 2^30-1, 2^62-1, etc.) all pass.
make test-all shows 34694 tests, 7325860 assertions, 0 new
failures (same pre-existing TestArgf#test_puts flake as on
master) — test_integer.rb alone runs 38 tests / 421628 assertions
of which Integer#to_s exercises the bulk, all pass.
The 200-byte lookup table sits in .rodata and spans only a few
cache lines (at least four 64-byte lines, or two 128-byte lines).
No change to public API, no change to bignum conversion, no change
to non-base-10 conversion paths.
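The two-digits-per-iteration technique described in this commit message can be sketched in standalone C as follows. This is a minimal illustration of the standard lookup-table itoa, not the actual rb_fix2str code: `digit_pairs`, `u64_to_dec`, and `i64_to_str` are hypothetical names, and the sketch handles base 10 only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* "00", "01", ..., "99": 100 two-character pairs, 200 bytes of data. */
static const char digit_pairs[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/* Write the decimal digits of u backwards from buf_end and return a
 * pointer to the first digit. Each loop iteration emits two digits
 * from one (u % 100, u / 100) pair. */
static char *
u64_to_dec(uint64_t u, char *buf_end)
{
    char *p = buf_end;
    while (u >= 100) {
        unsigned idx = (unsigned)(u % 100) * 2;
        u /= 100;
        p -= 2;
        memcpy(p, digit_pairs + idx, 2);
    }
    if (u >= 10) {               /* two digits left */
        p -= 2;
        memcpy(p, digit_pairs + u * 2, 2);
    }
    else {                       /* one digit left (including u == 0) */
        *--p = (char)('0' + u);
    }
    return p;
}

/* Format a signed value into out (24 bytes is enough for int64_t plus
 * sign and NUL). Returns a pointer into out where the text starts. */
static const char *
i64_to_str(int64_t v, char out[24])
{
    assert(out != NULL);
    char *end = out + 23;
    *end = '\0';
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    uint64_t u = v < 0 ? 0 - (uint64_t)v : (uint64_t)v;
    char *p = u64_to_dec(u, end);
    if (v < 0) *--p = '-';
    return p;
}
```

For a 19-digit value the loop runs 9 times instead of 19, which is where the halved iteration count comes from; for one- and two-digit values the loop body never runs, consistent with the flat results at the top of the benchmark table.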
* bignum: emit two decimal digits per iteration in big2str_2bdigits
Extend the 2-digit lookup-table itoa optimisation from rb_fix2str to
the inner conversion loop used by Bignum#to_s. big2str_2bdigits has
two code paths — a leading-chunk path that emits variable-length
digits, and a recursive-chunk path that emits a fixed-width zero-
padded block — and both gain from the halved division count. The
classic per-digit loop is preserved for non-base-10 conversion.
Moves the ruby_decimal_digit_pairs table from a file-static in
numeric.c to bignum.c next to ruby_digitmap, and exposes it through
internal/bignum.h so both files share the same 200-byte .rodata
instance.
Benchmark (Apple M-series, best of 3 runs, measures bignum-only
speedup against the preceding fixnum commit):
case          value       base         patch        delta
-----------   ---------   ----------   ----------   -------------------
big_20dig     10^19+...   146 ns/op    124 ns/op    -15%
big_40dig     10^39+...   174 ns/op    152 ns/op    -13%
big_100dig    10^99+42    236 ns/op    213 ns/op    -10%
big_500dig    10^499+7    1119 ns/op   1086 ns/op   -3%
big_1000dig   10^999      3490 ns/op   3459 ns/op   -1%
fix_19dig     2^62-1      76 ns/op     76 ns/op     0% (unchanged path)
Wins concentrate in the 20-100 digit range where big2str_2bdigits
is the dominant cost. Above ~500 digits the Karatsuba divmod
recursion dominates and the digit-emission saving shrinks to the
noise floor. The 20-100 range is what actual Ruby code exercises
(financial high-precision sums, nanosecond timestamps, large
counters); crypto-size (1000+ digit) bignums are rare in to_s paths.
Correctness: 100k random fixnum fuzz unchanged, 500 random bignum
fuzz up to 2^256 with cross-check against sprintf("%d"), bases
2/8/16/36 round-trip, plus edge cases (0, just-above-fixnum, ±2^100,
20-digit strings near the fixnum boundary). test/ruby/test_integer.rb
stays at 38 tests / 421628 assertions / 0 failures, test_bignum.rb
passes 74 / 607 / 0 failures, full make test-all reports 34694
tests / 0 new failures (same TestArgf#test_puts pre-existing flake
as master).
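The two code paths mentioned in this commit message (a variable-length leading chunk and fixed-width zero-padded chunks) can be illustrated with a small C sketch. It is written under stated assumptions and does not reproduce big2str_2bdigits: the 9-digit chunk width, the most-significant-first chunk array, and the names `dec_pairs`, `emit_fixed_width`, and `chunks_to_dec` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_DIGITS 9  /* assumed: each chunk holds a value < 10^9 */

/* "00".."99" pair table, same idea as the shared 200-byte table. */
static const char dec_pairs[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

/* Write exactly `width` digits of u into buf, zero-padded on the left,
 * two digits per iteration. The caller guarantees u < 10^width. */
static void
emit_fixed_width(uint64_t u, char *buf, int width)
{
    char *p = buf + width;
    while (p - buf >= 2) {
        p -= 2;
        memcpy(p, dec_pairs + (u % 100) * 2, 2);
        u /= 100;
    }
    if (p > buf) {               /* odd width: one digit remains */
        *--p = (char)('0' + u % 10);
        u /= 10;
    }
    assert(u == 0);
}

/* Join chunks (most significant first) into a NUL-terminated decimal
 * string. The leading chunk drops its zero padding; every later chunk
 * keeps it. Returns the string length. */
static size_t
chunks_to_dec(const uint32_t *chunks, size_t n, char *out)
{
    char lead[CHUNK_DIGITS];
    emit_fixed_width(chunks[0], lead, CHUNK_DIGITS);
    int skip = 0;
    while (skip < CHUNK_DIGITS - 1 && lead[skip] == '0') skip++;
    size_t len = (size_t)(CHUNK_DIGITS - skip);
    memcpy(out, lead + skip, len);
    for (size_t i = 1; i < n; i++) {
        emit_fixed_width(chunks[i], out + len, CHUNK_DIGITS);
        len += CHUNK_DIGITS;
    }
    out[len] = '\0';
    return len;
}
```

Here the leading chunk is produced by trimming a fixed-width block for brevity; a dedicated variable-length loop, as the commit describes, avoids emitting the padding in the first place.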
* benchmark: add int_to_s yaml for Integer#to_s
Reproducible benchmark for the two preceding commits. Covers:
- 1/2/3/5/10/19-digit positive fixnums (spans the break-even point
  and the two large-number wins at the top)
- A negative fixnum (exercises the minus-sign prepend path)
- 20/40/100-digit bignums (spans the big2str_2bdigits win range)
- Two string-interpolation scenarios, so reviewers can see how much
  of the Integer#to_s speedup reaches real code that allocates the
  result string too
Intended to be consumed by benchmark-driver against master vs
int-to-s-twodigit for A/B comparison. Matches the numbers in the
commit messages of 5bfb7e0 and c5df6de.
---------
Co-authored-by: tomoya ishida <tomoyapenguin@gmail.com>
Over time the .gdbinit initializer has drifted from the codebase and the rb_ps helper no longer works. This PR fixes it.
The changes that caused it to break were:
* 226f370 renamed cfp->iseq to cfp->_iseq.
* 6c24904 switched from storing the last_id to storing the next_id.
* f7ae32e removed ID_ENTRY_SIZE.
ZJIT: Use for_each_operand_mut in Function::find
No need to repeat this matching logic manually.
UTF-8 is the default for source files but can be overridden via options. ruby/prism@355f451528
… ripper translator
When no magic encoding comment is present, it does not default to utf-8, and takes the encoding of the string that contains the source code instead. Most of the time that will be utf-8, but not always. ruby/prism@1a273db780
These are internal-only helpers which can be used instead of the RMATCH_REGS struct directly. RMATCH_REGS is just a pointer offset from the RMatch VALUE itself, so this should not significantly affect codegen. The motivation for this is that it's both simpler, and should move us towards being able to replace the storage for RMATCH, and to be able to store the positions embedded instead of in separate malloc memory.
Unreachable instructions terminate blocks. We'll use this mostly for testing as a terminator instruction (since traditional basic blocks will require all blocks to end with a terminator).
The event hook may use the EC, which will be null when the hook is running from a GC thread.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )